
    Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana

    We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ~32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene

    A call for benchmarking transposable element annotation methods

    DNA derived from transposable elements (TEs) constitutes large parts of the genomes of complex eukaryotes, with major impacts not only on genomic research but also on how organisms evolve and function. Although a variety of methods and tools have been developed to detect and annotate TEs, there are as yet no standard benchmarks, that is, no standard way to measure or compare their accuracy. This lack of accuracy assessment calls into question conclusions from a wide range of research that depends explicitly or implicitly on TE annotation. In the absence of standard benchmarks, toolmakers are impeded in improving their tools, annotators cannot properly assess which tools might best suit their needs, and downstream researchers cannot judge how accuracy limitations might impact their studies. We therefore propose that the TE research community create and adopt standard TE annotation benchmarks, and we call for other researchers to join the authors in making this long-overdue effort a success

    Huntington's disease blood and brain show a common gene expression pattern and share an immune signature with Alzheimer's disease

    There is widespread transcriptional dysregulation in Huntington's disease (HD) brain, but analysis is inevitably limited by advanced disease and postmortem changes. However, mutant HTT is ubiquitously expressed and acts systemically, meaning blood, which is readily available and contains cells that are dysfunctional in HD, could act as a surrogate for brain tissue. We conducted an RNA-Seq transcriptomic analysis using whole blood from two HD cohorts, and performed gene set enrichment analysis using public databases and weighted correlation network analysis modules from HD and control brain datasets. We identified dysregulated gene sets in blood that replicated in the independent cohorts, correlated with disease severity, corresponded to the most significantly dysregulated modules in the HD caudate, the most prominently affected brain region, and significantly overlapped with the transcriptional signature of HD myeloid cells. High-throughput sequencing technologies and use of gene sets likely surmounted the limitations of previously inconsistent HD blood expression studies. Our results suggest transcription is disrupted in peripheral cells in HD through mechanisms that parallel those in brain. Immune upregulation in HD overlapped with Alzheimer's disease, suggesting a common pathogenic mechanism involving macrophage phagocytosis and microglial synaptic pruning, and raises the potential for shared therapeutic approaches

    Genome Stability of Lyme Disease Spirochetes: Comparative Genomics of Borrelia burgdorferi Plasmids

    Lyme disease is the most common tick-borne human illness in North America. In order to understand the molecular pathogenesis, natural diversity, population structure and epizootic spread of the North American Lyme agent, Borrelia burgdorferi sensu stricto, a much better understanding of the natural diversity of its genome will be required. Towards this end we present a comparative analysis of the nucleotide sequences of the numerous plasmids of B. burgdorferi isolates B31, N40, JD1 and 297. These strains were chosen because they include the three most commonly studied laboratory strains, and because they represent different major genetic lineages and so are informative regarding the genetic diversity and evolution of this organism. A unique feature of Borrelia genomes is that they carry a large number of linear and circular plasmids, and this work shows that strains N40, JD1, 297 and B31 carry related but non-identical sets of 16, 20, 19 and 21 plasmids, respectively, that comprise 33–40% of their genomes. We deduce that there are at least 28 plasmid compatibility types among the four strains. The B. burgdorferi ∼900 Kbp linear chromosomes are evolutionarily exceptionally stable, except for a short ≤20 Kbp plasmid-like section at the right end. A few of the plasmids, including the linear lp54 and circular cp26, are also very stable. We show here that the other plasmids, especially the linear ones, are considerably more variable. Nearly all of the linear plasmids have undergone one or more substantial inter-plasmid rearrangements since their last common ancestor. In spite of these rearrangements and differences in plasmid contents, the overall gene complement of the different isolates has remained relatively constant

    Vesiclepedia: A compendium for extracellular vesicles with continuous community annotation

    Extracellular vesicles (EVs) are membranous vesicles released by a variety of cells into their microenvironment. Recent studies have elucidated the role of EVs in intercellular communication, pathogenesis, drug, vaccine and gene-vector delivery, and as possible reservoirs of biomarkers. These findings have generated immense interest, along with an exponential increase in molecular data pertaining to EVs. Here, we describe Vesiclepedia, a manually curated compendium of molecular data (lipid, RNA, and protein) identified in different classes of EVs from more than 300 independent studies published over the past several years. Even though databases are indispensable resources for the scientific community, recent studies have shown that more than 50% of the databases are not regularly updated. In addition, more than 20% of the database links are inactive. To prevent such database and link decay, we have initiated a continuous community annotation project with the active involvement of EV researchers. The EV research community can set a gold standard in data sharing with Vesiclepedia, which could evolve as a primary resource for the field

    An atlas of over 90,000 conserved noncoding sequences provides insight into crucifer regulatory regions

    Despite the central importance of noncoding DNA to gene regulation and evolution, understanding of the extent of selection on plant noncoding DNA remains limited compared to that of other organisms. Here we report sequencing of genomes from three Brassicaceae species (Leavenworthia alabamica, Sisymbrium irio and Aethionema arabicum) and their joint analysis with six previously sequenced crucifer genomes. Conservation across orthologous bases suggests that at least 17% of the Arabidopsis thaliana genome is under selection, with nearly one-quarter of the sequence under selection lying outside of coding regions. Much of this sequence can be localized to approximately 90,000 conserved noncoding sequences (CNSs) that show evidence of transcriptional and post-transcriptional regulation. Population genomics analyses of two crucifer species, A. thaliana and Capsella grandiflora, confirm that most of the identified CNSs are evolving under medium to strong purifying selection. Overall, these CNSs highlight both similarities and several key differences between the regulatory DNA of plants and other species

    The evolutionary fate of MULE-mediated duplications of host gene fragments in rice

    DNA transposons are known to frequently capture duplicated fragments of host genes. The evolutionary impact of this phenomenon depends on how frequently the fragments retain protein-coding function as opposed to becoming pseudogenes. Gene fragment duplication by Mutator-like elements (MULEs) has previously been documented in maize, Arabidopsis, and rice. Here we present a rigorous genome-wide analysis of MULEs in the model plant Oryza sativa (domesticated rice). We identify 8274 MULEs with intact termini and target-site duplications (TSDs) and show that 1337 of them contain duplicated host gene fragments. Through a detailed examination of the 5% of duplicated gene fragments that are transcribed, we demonstrate that virtually all cases contain pseudogenic features such as fragmented conserved protein domains, frameshifts, and premature stop codons. In addition, we show that the distribution of the ratio of nonsynonymous to synonymous amino acid substitution rates for the duplications agrees with the expected distribution for pseudogenes. We conclude that MULE-mediated host gene duplication results in the formation of pseudogenes, not novel functional protein-coding genes; however, the transcribed duplications possess characteristics consistent with a potential role in the regulation of host gene expression

    Abiotic Stress Phenotypes Are Associated with Conserved Genes Derived from Transposable Elements

    Plant phenomics offers unique opportunities to accelerate our understanding of gene function and plant response to different environments, and may be particularly useful for studying previously uncharacterized genes. One important type of poorly characterized genes is those derived from transposable elements (TEs), which have departed from a mobility-driven lifestyle to attain new adaptive roles for the host (exapted TEs). We used phenomics approaches, coupled with reverse genetics, to analyze T-DNA insertion mutants of both previously reported and novel protein-coding exapted TEs in the model plant Arabidopsis thaliana. We show that mutations in most of these exapted TEs result in phenotypes, particularly when challenged by abiotic stress. We built statistical multi-dimensional phenotypic profiles and compared them to wild-type and known stress responsive mutant lines for each particular stress condition. We found that these exapted TEs may play roles in responses to phosphate limitation, tolerance to high salt concentration, freezing temperatures, and arsenic toxicity. These results not only experimentally validate a large set of putative functional exapted TEs recently discovered through computational analysis, but also uncover additional novel phenotypes for previously well-characterized exapted TEs in A. thaliana